Efficient Factorization of Synchronous Context-Free Grammars

نویسندگان

  • Hao Zhang
  • Daniel Gildea
چکیده

Factoring a Synchronous Context-Free Grammar into an equivalent grammar with a smaller number of nonterminals in each rule enables more efficient strategies for synchronous parsing. We present an algorithm for factoring an n-ary SCFG into a k-ary grammar in time O(kn). We also show how to efficiently compute the exact number of k-ary parsable permutations of length n, and discuss asymptotic behavior as n grows. The number of length n permutations that are k-ary parsable approaches a fixed ratio between successive terms as n grows for fixed k. As k grows, the difference between successive ratios approaches 1/e. The University of Rochester Computer Science Department supported this work.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Factorization of Synchronous Context-Free Grammars in Linear Time

Factoring a Synchronous Context-Free Grammar into an equivalent grammar with a smaller number of nonterminals in each rule enables synchronous parsing algorithms of lower complexity. The problem can be formalized as searching for the tree-decomposition of a given permutation with the minimal branching factor. In this paper, by modifying the algorithm of Uno and Yagiura (2000) for the closely re...

متن کامل

Factoring Synchronous Grammars by Sorting

Synchronous Context-Free Grammars (SCFGs) have been successfully exploited as translation models in machine translation applications. When parsing with an SCFG, computational complexity grows exponentially with the length of the rules, in the worst case. In this paper we examine the problem of factorizing each rule of an input SCFG to a generatively equivalent set of rules, each having the smal...

متن کامل

Synchronous Context-Free Tree Grammars

We consider pairs of context-free tree grammars combined through synchronous rewriting. The resulting formalism is at least as powerful as synchronous tree adjoining grammars and linear, nondeleting macro tree transducers, while the parsing complexity remains polynomial. Its power is subsumed by context-free hypergraph grammars. The new formalism has an alternative characterization in terms of ...

متن کامل

Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor

We present progress on Joshua, an opensource decoder for hierarchical and syntaxbased machine translation. The main focus is describing Thrax, a flexible, open source synchronous context-free grammar extractor. Thrax extracts both hierarchical (Chiang, 2007) and syntax-augmented machine translation (Zollmann and Venugopal, 2006) grammars. It is built on Apache Hadoop for efficient distributed p...

متن کامل

Efficient Multi-pass Decoding for Synchronous Context Free Grammars

We take a multi-pass approach to machine translation decoding when using synchronous context-free grammars as the translation model and n-gram language models: the first pass uses a bigram language model, and the resulting parse forest is used in the second pass to guide search with a trigram language model. The trigram pass closes most of the performance gap between a bigram decoder and a much...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006